What Triggers my Model? Contrastive Explanations Inform Gender Choices by Translation Models

Hackenbuchner, Janiça, Tezcan, Arda, Daems, Joke

arXiv.org Artificial Intelligence

Interpretability can be implemented as a means to understand decisions taken by (black box) models, such as machine translation (MT) models or large language models (LLMs). Yet research in this area has been limited with respect to one well-documented problem in these models: gender bias. With this research, we aim to move away from simply measuring bias to exploring its origins. Working with gender-ambiguous natural source data, this study examines which context, in the form of input tokens in the source sentence, influences (or triggers) the model's choice of a certain gender inflection in the target language. To analyse this, we use contrastive explanations and compute saliency attribution. We first address the challenge posed by the lack of a scoring threshold and specifically examine different attribution levels of source words on the model's gender decisions in the translation. We compare salient source words with human perceptions of gender and demonstrate a noticeable overlap between human perceptions and model attribution. Additionally, we provide a linguistic analysis of salient words. Our work showcases the relevance of understanding model translation decisions in terms of gender, shows how these compare to human decisions, and argues that this information should be leveraged to mitigate gender bias.
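
To make the method concrete, below is a minimal sketch of contrastive saliency attribution for a gender decision, in the spirit of the approach described above but not the authors' implementation. The MarianMT checkpoint, the example sentence, and the contrast between the German articles "Der" and "Die" are all illustrative assumptions; per-token saliency is taken as the gradient norm of the contrastive score with respect to the source embeddings.

```python
# A minimal sketch of contrastive saliency attribution for a gender decision.
# Assumptions (not from the paper): the MarianMT checkpoint, the example
# sentence, and contrasting the German articles "Der" vs. "Die".
import torch
from transformers import MarianMTModel, MarianTokenizer

model_name = "Helsinki-NLP/opus-mt-en-de"  # assumed checkpoint
tok = MarianTokenizer.from_pretrained(model_name)
model = MarianMTModel.from_pretrained(model_name).eval()

src = "The doctor finished the night shift."
enc = tok(src, return_tensors="pt")

# Capture the source token embeddings so we can take gradients w.r.t. them.
cache = {}
def keep(module, inputs, output):
    output.retain_grad()
    cache["emb"] = output
handle = model.model.encoder.embed_tokens.register_forward_hook(keep)

# First decoder step: score the masculine vs. feminine article.
dec_in = torch.tensor([[model.config.decoder_start_token_id]])
logits = model(input_ids=enc.input_ids,
               attention_mask=enc.attention_mask,
               decoder_input_ids=dec_in).logits[0, -1]
der = tok("Der", add_special_tokens=False).input_ids[0]  # assumed single piece
die = tok("Die", add_special_tokens=False).input_ids[0]

# The contrastive score log p("Der") - log p("Die") equals the logit
# difference, since both terms share the same normalizing constant.
(logits[der] - logits[die]).backward()
handle.remove()

# Per-source-token saliency: L2 norm of the embedding gradient.
scores = cache["emb"].grad[0].norm(dim=-1)
for token, s in zip(tok.convert_ids_to_tokens(enc.input_ids[0].tolist()), scores):
    print(f"{token:>12}  {s.item():.4f}")
```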


TransAlign: Machine Translation Encoders are Strong Word Aligners, Too

Ebing, Benedikt, Goldschmied, Christian, Glavaš, Goran

arXiv.org Artificial Intelligence

In the absence of sizable training data for most world languages and NLP tasks, translation-based strategies such as translate-test -- evaluating on noisy source language data translated from the target language -- and translate-train -- training on noisy target language data translated from the source language -- have been established as competitive approaches for cross-lingual transfer (XLT). For token classification tasks, these strategies require label projection: mapping the labels from each token in the original sentence to its counterpart(s) in the translation. To this end, it is common to leverage multilingual word aligners (WAs) derived from encoder language models such as mBERT or LaBSE. Despite obvious associations between machine translation (MT) and WA, research on extracting alignments with MT models is largely limited to exploiting cross-attention in encoder-decoder architectures, yielding poor WA results. In this work, in contrast, we propose TransAlign, a novel word aligner that utilizes the encoder of a massively multilingual MT model. We show that TransAlign not only achieves strong WA performance but substantially outperforms popular WA and state-of-the-art non-WA-based label projection methods in MT-based XLT for token classification.
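
For intuition, here is a minimal sketch of extracting word alignments from an MT encoder via cosine similarity and mutual-argmax intersection (a SimAlign-style heuristic), not the TransAlign method itself. The NLLB checkpoint, the whitespace word segmentation, and the mean-pooling of subwords are illustrative assumptions.

```python
# A SimAlign-style sketch: align words via cosine similarity between MT
# encoder states, keeping mutually-best pairs. Assumptions: the NLLB
# checkpoint, whitespace word segmentation, and mean-pooling of subwords.
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

name = "facebook/nllb-200-distilled-600M"  # assumed multilingual MT model
tok = AutoTokenizer.from_pretrained(name)
encoder = AutoModelForSeq2SeqLM.from_pretrained(name).get_encoder().eval()

def embed_words(sentence, lang):
    tok.src_lang = lang
    enc = tok(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = encoder(**enc).last_hidden_state[0]
    # Pool subword vectors back to whitespace words using the fast
    # tokenizer's word_ids(); special tokens map to None and are skipped.
    words = sentence.split()
    pooled = [hidden[[i for i, w in enumerate(enc.word_ids()) if w == k]].mean(0)
              for k in range(len(words))]
    return words, torch.stack(pooled)

src_words, S = embed_words("The cat sat on the mat", "eng_Latn")
tgt_words, T = embed_words("Die Katze saß auf der Matte", "deu_Latn")

# Cosine similarity matrix; intersect forward and backward argmax.
sim = (torch.nn.functional.normalize(S, dim=-1)
       @ torch.nn.functional.normalize(T, dim=-1).T)
fwd, bwd = sim.argmax(dim=1).tolist(), sim.argmax(dim=0).tolist()
for i, j in enumerate(fwd):
    if bwd[j] == i:  # mutually best -> keep the alignment edge
        print(src_words[i], "<->", tgt_words[j])
```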


Data Fusion of Deep Learned Molecular Embeddings for Property Prediction

Appleton, Robert J, Barnes, Brian C, Strachan, Alejandro

arXiv.org Artificial Intelligence

Data-driven approaches such as deep learning can result in predictive models for material properties with exceptional accuracy and efficiency. However, in many applications data is sparse, severely limiting their accuracy and applicability. To improve predictions, techniques such as transfer learning and multi-task learning have been used. The performance of multi-task learning models depends on the strength of the underlying correlations between tasks and the completeness of the dataset. Standard multi-task models tend to underperform when trained on sparse datasets with weakly correlated properties. To address this gap, we fuse deep-learned embeddings generated by independent pre-trained single-task models, resulting in a multi-task model that inherits rich, property-specific representations. By re-using (rather than re-training) these embeddings, the resulting fused model outperforms standard multi-task models and can be extended with fewer trainable parameters. We demonstrate this technique on a widely used benchmark dataset of quantum chemistry data for small molecules as well as a newly compiled sparse dataset of experimental data collected from literature and our own quantum chemistry and thermochemical calculations.
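
The fusion idea can be illustrated with a short PyTorch sketch: embeddings from independently pre-trained single-task networks are frozen and concatenated, and only small task heads are trained on top. The stand-in networks and layer sizes below are illustrative, not the paper's architecture.

```python
# A sketch of embedding fusion: freeze embeddings from independently
# pre-trained single-task models, concatenate them, and train only small
# task heads. Stand-in networks and sizes are illustrative.
import torch
import torch.nn as nn

class SingleTaskNet(nn.Module):
    """Stand-in for a pre-trained single-task property predictor."""
    def __init__(self, in_dim=128, emb_dim=64):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU(),
                                     nn.Linear(256, emb_dim), nn.ReLU())
        self.head = nn.Linear(emb_dim, 1)  # original single-property head

    def embed(self, x):
        return self.encoder(x)

class FusedMultiTaskNet(nn.Module):
    def __init__(self, pretrained, n_tasks=2, emb_dim=64):
        super().__init__()
        self.backbones = nn.ModuleList(pretrained)
        for p in self.backbones.parameters():
            p.requires_grad = False  # re-use, don't re-train
        fused_dim = emb_dim * len(pretrained)
        self.heads = nn.ModuleList(nn.Linear(fused_dim, 1) for _ in range(n_tasks))

    def forward(self, x):
        z = torch.cat([b.embed(x) for b in self.backbones], dim=-1)
        return torch.cat([h(z) for h in self.heads], dim=-1)

model = FusedMultiTaskNet([SingleTaskNet(), SingleTaskNet()])
print(model(torch.randn(4, 128)).shape)  # (4, n_tasks)
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"trainable parameters: {trainable}")  # only the fusion heads
```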


GMU Systems for the IWSLT 2025 Low-Resource Speech Translation Shared Task

Meng, Chutong, Anastasopoulos, Antonios

arXiv.org Artificial Intelligence

This paper describes the GMU systems for the IWSLT 2025 low-resource speech translation shared task. We trained systems for all language pairs, except for Levantine Arabic. We fine-tuned SeamlessM4T-v2 for automatic speech recognition (ASR), machine translation (MT), and end-to-end speech translation (E2E ST). The ASR and MT models are also used to form cascaded ST systems. Additionally, we explored various training paradigms for E2E ST fine-tuning, including direct E2E fine-tuning, multi-task training, and parameter initialization using components from fine-tuned ASR and/or MT models. Our results show that (1) direct E2E fine-tuning yields strong results; (2) initializing with a fine-tuned ASR encoder improves ST performance on languages SeamlessM4T-v2 has not been trained on; (3) multi-task training can be slightly helpful.
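
As an illustration of the parameter-initialization paradigm, the sketch below copies the speech-encoder weights from a fine-tuned ASR checkpoint into an ST model before ST fine-tuning. The checkpoint path and the state-dict key prefix are placeholders, not the authors' artifacts.

```python
# A sketch of the "initialize from fine-tuned ASR" paradigm: load the ST
# model, then overwrite its speech encoder with weights from a fine-tuned
# ASR checkpoint. The file path and the "speech_encoder." key prefix are
# placeholders for however the ASR fine-tune was saved.
import torch
from transformers import SeamlessM4Tv2Model

st_model = SeamlessM4Tv2Model.from_pretrained("facebook/seamless-m4t-v2-large")
asr_state = torch.load("asr_finetuned.pt", map_location="cpu")  # placeholder

# Keep only speech-encoder parameters; everything else retains its
# original (or MT-initialized) weights.
enc_state = {k.removeprefix("speech_encoder."): v
             for k, v in asr_state.items() if k.startswith("speech_encoder.")}
missing, unexpected = st_model.speech_encoder.load_state_dict(enc_state, strict=False)
print(f"missing: {len(missing)}, unexpected: {len(unexpected)}")
```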


Parallel Corpora for Machine Translation in Low-resource Indic Languages: A Comprehensive Review

Raja, Rahul, Vats, Arpita

arXiv.org Artificial Intelligence

Parallel corpora play an important role in training machine translation (MT) models, particularly for low-resource languages where high-quality bilingual data is scarce. This review provides a comprehensive overview of available parallel corpora for Indic languages, which span diverse linguistic families, scripts, and regional variations. We categorize these corpora into text-to-text, code-switched, and various categories of multimodal datasets, highlighting their significance in the development of robust multilingual MT systems. Beyond resource enumeration, we critically examine the challenges faced in corpus creation, including linguistic diversity, script variation, data scarcity, and the prevalence of informal textual content. We also discuss and evaluate these corpora along dimensions such as alignment quality and domain representativeness. Furthermore, we address open challenges such as data imbalance across Indic languages, the trade-off between quality and quantity, and the impact of noisy, informal, and dialectal data on MT performance. Finally, we outline future directions, including leveraging cross-lingual transfer learning, expanding multilingual datasets, and integrating multimodal resources to enhance translation quality. To the best of our knowledge, this paper presents the first comprehensive review of parallel corpora specifically tailored for low-resource Indic languages in the context of machine translation.
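
One of the evaluation dimensions above, alignment quality, is often operationalized by scoring sentence pairs with multilingual sentence embeddings and discarding low-similarity pairs. The sketch below illustrates this with LaBSE; the model choice, example pairs, and the 0.7 threshold are illustrative assumptions, not prescriptions from the review.

```python
# A sketch of an alignment-quality filter: score pairs with multilingual
# sentence embeddings and drop low-similarity pairs. LaBSE and the 0.7
# threshold are illustrative choices, not prescriptions from the review.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("sentence-transformers/LaBSE")

pairs = [("How are you?", "आप कैसे हैं?"),                    # well aligned
         ("The weather is nice today.", "यह एक किताब है।")]   # misaligned

src_emb = model.encode([s for s, _ in pairs], normalize_embeddings=True)
tgt_emb = model.encode([t for _, t in pairs], normalize_embeddings=True)
scores = (src_emb * tgt_emb).sum(axis=1)  # cosine similarity per pair

for (src, tgt), score in zip(pairs, scores):
    verdict = "keep" if score >= 0.7 else "drop"
    print(f"{score:.2f} [{verdict}] {src} ||| {tgt}")
```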


Are All Spanish Doctors Male? Evaluating Gender Bias in German Machine Translation

Kappl, Michelle

arXiv.org Artificial Intelligence

We present WinoMTDE, a new gender bias evaluation test set designed to assess occupational stereotyping and underrepresentation in German machine translation (MT) systems. Building on the automatic evaluation method introduced by arXiv:1906.00591v1, we extend the approach to German, a language with grammatical gender. The WinoMTDE dataset comprises 288 German sentences balanced with regard to both gender and stereotype, the latter annotated using German labor statistics. We conduct a large-scale evaluation of five widely used MT systems and a large language model. Our results reveal persistent bias in most models, with the LLM outperforming the traditional systems. The dataset and evaluation code are publicly available at https://github.com/michellekappl/mt_gender_german.
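
The underlying evaluation idea can be sketched simply: translate an English sentence whose pronoun fixes the referent's gender, then check which gendered German occupation form the system produced. The MT checkpoint, the example sentence, and the tiny form lookup below are illustrative; the actual WinoMTDE evaluation runs alignment-based checks over the full annotated set.

```python
# A sketch of the evaluation idea: translate a sentence whose pronoun fixes
# the referent's gender, then check which gendered German occupation form
# the system produced. Model, sentence, and the tiny lookup are illustrative;
# WinoMTDE itself checks the full annotated set with alignment.
from transformers import pipeline

translate = pipeline("translation", model="Helsinki-NLP/opus-mt-en-de")

# "her" marks the doctor as female, so a faithful translation should use
# the feminine "Ärztin" rather than the masculine "Arzt".
sentence = "The doctor finished her shift."
forms = {"female": "Ärztin", "male": "Arzt"}

out = translate(sentence)[0]["translation_text"]
predicted = next((g for g, form in forms.items() if form in out), "unknown")
print(out, "->", predicted, "(gold: female)")
```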


Token-level Ensembling of Models with Different Vocabularies

Wicks, Rachel, Ravisankar, Kartik, Yang, Xinchen, Koehn, Philipp, Post, Matt

arXiv.org Artificial Intelligence

Model ensembling is a technique to combine the predicted distributions of two or more models, often leading to improved robustness and performance. For ensembling in text generation, the next token's probability distribution is derived from a weighted sum of the distributions of each individual model. This requires the underlying models to share the same subword vocabulary, limiting the applicability of ensembling, since many open-sourced models have distinct vocabularies. In research settings, experimentation or upgrades to vocabularies may introduce multiple vocabulary sizes. This paper proposes an inference-time-only algorithm that allows for ensembling models with different vocabularies, without the need to learn additional parameters or alter the underlying models. Instead, the algorithm ensures that tokens generated by the ensembled models agree in their surface form. We apply this technique to combinations of traditional encoder-decoder models and decoder-only LLMs and evaluate on machine translation. In addition to expanding to model pairs that were previously incapable of token-level ensembling, our algorithm frequently improves translation performance over either model individually.
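
A heavily simplified greedy variant of the surface-form agreement idea is sketched below: each model nominates its most likely next token as a string, and every candidate string is re-scored by both models, each using its own subword segmentation to produce it. The paper's actual algorithm is more general (beam search, encoder-decoder plus decoder-only combinations); the two small causal LMs with different vocabularies are stand-ins.

```python
# A heavily simplified greedy sketch of surface-form agreement between two
# LMs with different vocabularies (the paper uses beam search and covers
# encoder-decoder models too). Each model nominates its most likely next
# token as a *string*; every candidate is re-scored by both models, each
# using its own subword segmentation. Model names are stand-ins.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

names = ["gpt2", "EleutherAI/pythia-70m"]  # assumed: different vocabularies
toks = [AutoTokenizer.from_pretrained(n) for n in names]
lms = [AutoModelForCausalLM.from_pretrained(n).eval() for n in names]

def score(text, tok, lm):
    """Log-probability of `text` under one model (full re-scoring)."""
    ids = tok(text, return_tensors="pt").input_ids
    with torch.no_grad():
        logp = lm(ids).logits.log_softmax(-1)
    return logp[0, :-1].gather(1, ids[0, 1:, None]).sum().item()

prefix = "Machine translation is"
for _ in range(10):
    # Each model proposes its greedy continuation as a surface string.
    candidates = set()
    for tok, lm in zip(toks, lms):
        ids = tok(prefix, return_tensors="pt").input_ids
        with torch.no_grad():
            next_id = int(lm(ids).logits[0, -1].argmax())
        candidates.add(tok.decode([next_id]))
    # Keep the candidate string that the models jointly prefer.
    prefix += max(candidates,
                  key=lambda c: sum(score(prefix + c, t, m)
                                    for t, m in zip(toks, lms)))
print(prefix)
```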


Efficient Machine Translation Corpus Generation: Integrating Human-in-the-Loop Post-Editing with Large Language Models

Yuksel, Kamer Ali, Gunduz, Ahmet, Anees, Abdul Baseet, Sawaf, Hassan

arXiv.org Artificial Intelligence

This paper introduces an advanced methodology for machine translation (MT) corpus generation, integrating semi-automated, human-in-the-loop post-editing with large language models (LLMs) to enhance efficiency and translation quality. Building upon previous work that utilized real-time training of a custom MT quality estimation metric, this system incorporates novel LLM features such as Enhanced Translation Synthesis and Assisted Annotation Analysis, which improve initial translation hypotheses and quality assessments, respectively. Additionally, the system employs LLM-Driven Pseudo Labeling and a Translation Recommendation System to reduce human annotator workload in specific contexts. These improvements not only retain the original benefits of cost reduction and enhanced post-edit quality but also open new avenues for leveraging cutting-edge LLM advancements. The project's source code is available for community use, promoting collaborative developments in the field. The demo video can be accessed here.
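
The routing logic behind pseudo labeling and reduced annotator workload can be sketched simply: hypotheses whose estimated quality clears a threshold are auto-accepted, the rest are queued for human post-editing. The llm_quality_score stub below is a hypothetical stand-in for the system's QE metric or LLM judge, and the threshold is an arbitrary illustration.

```python
# A sketch of the routing logic behind pseudo labeling: hypotheses whose
# estimated quality clears a threshold are auto-accepted, the rest are
# queued for human post-editing. `llm_quality_score` is a hypothetical
# stand-in for the system's QE metric / LLM judge; 0.85 is arbitrary.
from dataclasses import dataclass

@dataclass
class Segment:
    source: str
    hypothesis: str

def llm_quality_score(seg: Segment) -> float:
    """Placeholder: would prompt an LLM or a trained QE model for a 0-1 score."""
    return 0.9 if seg.hypothesis else 0.0

def route(segments, threshold=0.85):
    accepted, needs_human = [], []
    for seg in segments:
        (accepted if llm_quality_score(seg) >= threshold else needs_human).append(seg)
    return accepted, needs_human

auto, queue = route([Segment("Hello world", "Hallo Welt"),
                     Segment("Good morning", "")])
print(f"auto-accepted: {len(auto)}, sent to annotators: {len(queue)}")
```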


SpeechT: Findings of the First Mentorship in Speech Translation

Moslem, Yasmin, Morán, Juan Julián Cea, Gonzalez-Gomez, Mariano, Farouq, Muhammad Hazim Al, Abdou, Farah, Deb, Satarupa

arXiv.org Artificial Intelligence

This work presents the details and findings of the first mentorship in speech translation (SpeechT), which took place in December 2024 and January 2025. To fulfil the requirements of the mentorship, the participants engaged in key activities, including data preparation, modelling, and advanced research.


LoCoML: A Framework for Real-World ML Inference Pipelines

Maddireddy, Kritin, Methukula, Santhosh Kotekal, Sridhar, Chandrasekar, Vaidhyanathan, Karthik

arXiv.org Artificial Intelligence

The widespread adoption of machine learning (ML) has brought forth diverse models with varying architectures and data requirements, introducing new challenges in integrating these systems into real-world applications. Traditional solutions often struggle to manage the complexities of connecting heterogeneous models, especially when dealing with varied technical specifications. These limitations are amplified in large-scale, collaborative projects where stakeholders contribute models with different technical specifications. To address these challenges, we developed LoCoML, a low-code framework designed to simplify the integration of diverse ML models within the context of the Bhashini Project, a large-scale initiative aimed at integrating AI-driven language technologies such as automatic speech recognition, machine translation, text-to-speech, and optical character recognition to support seamless communication across more than 20 languages. Initial evaluations show that LoCoML adds only a small amount of computational load, making it efficient and effective for large-scale ML integration. Our experience shows that a low-code approach can be a practical solution for connecting multiple ML models in a collaborative environment.
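
The kind of orchestration such a framework provides can be illustrated with a toy pipeline in which heterogeneous stages (ASR, MT, TTS) sit behind one uniform node interface and are chained declaratively. The classes below are an illustrative sketch, not the LoCoML API.

```python
# A toy version of the orchestration pattern: heterogeneous stages (ASR ->
# MT -> TTS) behind one uniform node interface, chained declaratively.
# These classes are an illustrative sketch, not the LoCoML API; each stage
# could wrap a REST call, an ONNX session, or a local model.
from typing import Callable, List

class Node:
    def __init__(self, name: str, fn: Callable):
        self.name, self.fn = name, fn

    def __call__(self, x):
        print(f"[{self.name}] running")
        return self.fn(x)

class Pipeline:
    def __init__(self, nodes: List[Node]):
        self.nodes = nodes

    def run(self, x):
        for node in self.nodes:
            x = node(x)  # output of one stage feeds the next
        return x

pipeline = Pipeline([
    Node("asr", lambda audio: "नमस्ते दुनिया"),  # speech -> Hindi text (stub)
    Node("mt",  lambda text: "Hello world"),      # Hindi -> English (stub)
    Node("tts", lambda text: b"<wav bytes>"),     # text -> speech (stub)
])
print(pipeline.run(b"<audio bytes>"))
```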